{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "d8814e38-0775-4e94-b88e-af060046e3b0",
   "metadata": {},
   "source": [
    "# Multi Document Agents Pack\n",
    "\n",
    "This LlamaPack provides an example of our multi-document agents."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "607f692e-b78d-420e-8593-7ce2c775568a",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index llama-hub"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "026f0383-43b2-43e6-8885-c6ede3107054",
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3554118-992a-437e-9b8b-b95771067b5b",
   "metadata": {},
   "source": [
    "### Setup Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3fd0a1c-a80b-4dc2-bd6a-f559637d7d39",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "import requests\n",
    "from llama_index.core import SimpleDirectoryReader\n",
    "\n",
    "wiki_titles = [\n",
    "    \"Toronto\",\n",
    "    \"Seattle\",\n",
    "    \"Chicago\",\n",
    "    \"Boston\",\n",
    "    \"Houston\",\n",
    "    \"Tokyo\",\n",
    "    \"Berlin\",\n",
    "    \"Lisbon\",\n",
    "    \"Paris\",\n",
    "    \"London\",\n",
    "    \"Atlanta\",\n",
    "    \"Munich\",\n",
    "    \"Shanghai\",\n",
    "    \"Beijing\",\n",
    "    \"Copenhagen\",\n",
    "    \"Moscow\",\n",
    "    \"Cairo\",\n",
    "    \"Karachi\",\n",
    "]\n",
    "\n",
    "for title in wiki_titles:\n",
    "    response = requests.get(\n",
    "        \"https://en.wikipedia.org/w/api.php\",\n",
    "        params={\n",
    "            \"action\": \"query\",\n",
    "            \"format\": \"json\",\n",
    "            \"titles\": title,\n",
    "            \"prop\": \"extracts\",\n",
    "            # 'exintro': True,\n",
    "            \"explaintext\": True,\n",
    "        },\n",
    "    ).json()\n",
    "    page = next(iter(response[\"query\"][\"pages\"].values()))\n",
    "    wiki_text = page[\"extract\"]\n",
    "\n",
    "    data_path = Path(\"data\")\n",
    "    if not data_path.exists():\n",
    "        Path.mkdir(data_path)\n",
    "\n",
    "    with open(data_path / f\"{title}.txt\", \"w\") as fp:\n",
    "        fp.write(wiki_text)\n",
    "\n",
    "# Load all wiki documents\n",
    "city_docs = {}\n",
    "for wiki_title in wiki_titles:\n",
    "    docs = SimpleDirectoryReader(input_files=[f\"data/{wiki_title}.txt\"]).load_data()\n",
    "    desc = f\"This is a Wikipedia article about the city of {wiki_title}\"\n",
    "    city_docs[wiki_title] = (wiki_title, desc, docs[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5169109b-e361-4d75-842a-fdc9992bb6e6",
   "metadata": {},
   "source": [
    "### Download and Initialize Pack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "86648344-32aa-4190-99e0-7c3e17304234",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.llama_pack import download_llama_pack\n",
    "\n",
    "MultiDocumentAgentsPack = download_llama_pack(\n",
    "    \"MultiDocumentAgentsPack\",\n",
    "    \"./multi_doc_agents_pack\",\n",
    "    # leave the below commented out (was for testing purposes)\n",
    "    # llama_hub_url=\"https://raw.githubusercontent.com/run-llama/llama-hub/jerry/add_llama_packs/llama_hub\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "58f3cf27-1f82-42a7-a52c-5f06a34ba7c3",
   "metadata": {},
   "outputs": [],
   "source": [
    "city_tups = list(city_docs.values())\n",
    "multi_doc_agents_pack = MultiDocumentAgentsPack(\n",
    "    [t[2] for t in city_tups], [t[0] for t in city_tups], [t[1] for t in city_tups]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "71617ada-61f8-4ee5-ae43-f9ffe58ac700",
   "metadata": {},
   "source": [
    "### Run Pack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "881cf499-cac0-41c1-8e7e-dc0e91a72473",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "STARTING TURN 1\n",
      "---------------\n",
      "\n",
      "=== Calling Function ===\n",
      "Calling function: tool_Houston with args: {\n",
      "  \"input\": \"demographics\"\n",
      "}\n",
      "STARTING TURN 1\n",
      "---------------\n",
      "\n",
      "=== Calling Function ===\n",
      "Calling function: vector_tool with args: {\n",
      "  \"input\": \"demographics\"\n",
      "}\n",
      "Got output: Houston has a population of 2,304,580 according to the 2020 U.S. census. In 2017, the estimated population was 2,312,717, and in 2018 it was 2,325,502. The city has a diverse demographic makeup, with a significant number of undocumented immigrants residing in the Houston area, comprising nearly 9% of the city's metropolitan population in 2017. The age distribution in Houston includes a significant number of individuals under 15 and between the ages of 20 to 34. The median age of the city is 33.4. The city has a mix of homeowners and renters, with an estimated 42.3% of Houstonians owning housing units. The median household income in 2019 was $52,338, and 20.1% of Houstonians lived at or below the poverty line.\n",
      "========================\n",
      "\n",
      "STARTING TURN 2\n",
      "---------------\n",
      "\n",
      "Got output: Houston has a population of 2,304,580 according to the 2020 U.S. census. The city has a diverse demographic makeup, with a significant number of undocumented immigrants residing in the Houston area, comprising nearly 9% of the city's metropolitan population in 2017. The age distribution in Houston includes a significant number of individuals under 15 and between the ages of 20 to 34. The median age of the city is 33.4. The city has a mix of homeowners and renters, with an estimated 42.3% of Houstonians owning housing units. The median household income in 2019 was $52,338, and 20.1% of Houstonians lived at or below the poverty line.\n",
      "========================\n",
      "\n",
      "STARTING TURN 2\n",
      "---------------\n",
      "\n",
      "=== Calling Function ===\n",
      "Calling function: tool_Chicago with args: {\n",
      "  \"input\": \"demographics\"\n",
      "}\n",
      "STARTING TURN 1\n",
      "---------------\n",
      "\n",
      "=== Calling Function ===\n",
      "Calling function: vector_tool with args: {\n",
      "  \"input\": \"demographics\"\n",
      "}\n",
      "Got output: Chicago experienced rapid population growth during its first hundred years, becoming one of the fastest-growing cities in the world. From its founding in 1833 with fewer than 200 people, the population grew to over 4,000 within seven years. By 1890, the population had surpassed 1 million, making Chicago the fifth-largest city in the world at the time. The city's population continued to grow, reaching its highest recorded population of 3.6 million in 1950. However, in the latter half of the 20th century, Chicago's population declined, dropping to under 2.7 million by 2010. The city experienced a rise in population for the 2000 census, followed by a decrease in 2010, and then another increase for the 2020 census. According to U.S. census estimates as of July 2019, the largest racial or ethnic groups in Chicago are non-Hispanic White (32.8%), Blacks (30.1%), and Hispanics (29.0%). Additionally, Chicago has the third-largest LGBT population in the United States, with an estimated 7.5% of the adult population identifying as LGBTQ in 2018.\n",
      "========================\n",
      "\n",
      "STARTING TURN 2\n",
      "---------------\n",
      "\n",
      "Got output: Chicago has seen significant population growth since its founding in 1833, with the population reaching over 4,000 within seven years. By 1890, the population had surpassed 1 million, making Chicago the fifth-largest city in the world at the time. The city's population peaked at 3.6 million in 1950, but has since declined to under 2.7 million by 2010. However, there has been a slight increase in the population according to the 2020 census. \n",
      "\n",
      "As of July 2019, the largest racial or ethnic groups in Chicago are non-Hispanic White (32.8%), Blacks (30.1%), and Hispanics (29.0%). Additionally, Chicago has the third-largest LGBT population in the United States, with an estimated 7.5% of the adult population identifying as LGBTQ in 2018.\n",
      "========================\n",
      "\n",
      "STARTING TURN 3\n",
      "---------------\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# this will run the full pack\n",
    "response = multi_doc_agents_pack.run(\n",
    "    \"Tell me the demographics of Houston, and then compare with the demographics of Chicago\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9b4f72c6-7ea5-4faa-8963-2fc89c6694a4",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "During his time in YC, the author worked with a new batch of startups every six months. He engaged with the problems faced by these startups and worked hard to help them. He mentioned that there were parts of the job he didn't like, such as disputes between cofounders and dealing with people who maltreated the startups. However, he worked hard even at the parts he didn't like. The author also mentioned that he wanted YC to be successful, so he worked very hard, setting an example for others.\n"
     ]
    }
   ],
   "source": [
    "print(str(response))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5c4f0ed8-032a-436b-b41d-0970eb516943",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": null,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(response.source_nodes)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "23ed9a46-7bf4-45ac-a43a-403d531f98c2",
   "metadata": {},
   "source": [
    "### Inspect Modules"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8cd60ba2-78d0-4941-ac82-fc0926f6855f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'vector_retriever': <llama_index.core.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x2bc0e76d0>,\n",
       " 'fusion_retriever': <llama_index.core.retrievers.fusion_retriever.QueryFusionRetriever at 0x2bc0e7eb0>,\n",
       " 'query_engine': <llama_index.core.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x2891c5990>}"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "modules = multi_doc_agents_pack.get_modules()\n",
    "display(modules)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama_hub",
   "language": "python",
   "name": "llama_hub"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
